Improved Modeling in Handwriting Recognition
Authors
Abstract
In this work, a script-independent handwriting recognition system is proposed, derived from the RWTH-ASR hidden Markov model (HMM) based speech recognizer. Most problems in handwriting recognition (HWR) are caused by large variations within the written text. In particular, different handwriting styles such as cursive writing or long drawn-out strokes are difficult to model. Common handwriting recognition systems use various preprocessing and feature extraction methods to compensate for these variations. Other approaches use writer adaptive training for writer-dependent modeling of the variations. This work takes a different approach: it handles these variations through explicit background modeling, improved visual models of the handwriting, and the character context within the script. As a result, only simple appearance-based features and only a few preprocessing steps are needed. Instead, methods known from handwriting recognition, such as model length estimation and discriminative training for writer-independent modeling, as well as common methods from speech recognition, are applied. In addition, several approaches for improving continuous text line recognition are presented, using lexica and language models that account for the particularities of line recognition. The script independence of the proposed system is demonstrated on Arabic and Latin recognition tasks. The performance is evaluated on the IFN/ENIT database, which provides an Arabic single word recognition task, and on the IAM database, which provides a Latin text line recognition task. The results obtained on both databases improve on the best currently known error rates achieved with a single recognition system.
Similar resources
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
A new hybrid approach to large vocabulary cursive handwriting recognition
This paper presents a novel hybrid modeling technique that is used for the first time in hidden Markov model based handwriting recognition. This new approach combines the advantages of discrete and continuous Markov models, and it is shown that this is especially suitable for modeling the features typically used in handwriting recognition. The performance of this hybrid technique is demonstrated ...
On-Line Handwriting Recognition Using Hidden Markov Models
New global information-bearing features improved the modeling of individual letters, thus diminishing the error rate of an HMM-based on-line cursive handwriting recognition system. This system also demonstrated the ability to recognize on-line cursive handwriting in real time. The BYBLOS continuous speech recognition system, a hidden Markov model (HMM) based recognition system, is applied to on...
Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
Recognition of Sequence of Print and Ink Strokes: Investigation the Effect of Handwriting Pressure, Hue of Ink, Printer and Paper Type
With the introduction of digital techniques, forensic document examiners have been encouraged to work with better accuracy in non-destructive ways. The aim of this study was to present a non-destructive, accessible, affordable, user-friendly, portable, useful and easy technique for determining the order of crossing lines of ink strokes and printed text. The intersections of LaserJet and In...